Deep Learning (MIT Press Essential Knowledge series) by Kelleher John D
Author: Kelleher, John D.
Language: eng
Format: azw3
Tags: Long Short-Term Memory, Neural Networks, Big Data, Deep Learning, Convolutional Neural Network, Machine Learning, Backpropagation, Recurrent Neural Network, Artificial Intelligence
Publisher: The MIT Press
Published: 2019-08-15T16:00:00+00:00
Layer-Wise Pretraining Using Autoencoders
In layer-wise pretraining, the initial autoencoder learns an encoding for the raw inputs to the network. Once this encoding has been learned, the units in the hidden encoding layer are fixed, and the output (decoding) layer is thrown away. Then a second autoencoder is trained—but this autoencoder is trained to reconstruct the representation of the data generated by passing it through the encoding layer of the initial autoencoder. In effect, this second autoencoder is stacked on top of the encoding layer of the first autoencoder. This stacking of encoding layers is considered to be a greedy process because each encoding layer is optimized independently of the later layers; in other words, each autoencoder focuses on finding the best solution for its immediate task (learning a useful encoding for the data it must reconstruct) rather than trying to find a solution to the overall problem for the network.
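The following sketch, which is not from the book, illustrates this greedy layer-wise procedure using tf.keras. The layer sizes (4, 3, and 2 units, chosen to match figure 4.5), the sigmoid activations, the optimizer, and the number of epochs are all illustrative assumptions rather than choices made by the author.

```python
import numpy as np
from tensorflow import keras

# Toy data: 1,000 input vectors of length 4 (dimensions chosen to match figure 4.5).
X = np.random.rand(1000, 4).astype("float32")

# --- First autoencoder: encode the 4 inputs into 3 units, then reconstruct the 4 inputs. ---
encoder_1 = keras.layers.Dense(3, activation="sigmoid", name="encoder_1")
autoencoder_1 = keras.Sequential([
    keras.Input(shape=(4,)),
    encoder_1,
    keras.layers.Dense(4, activation="sigmoid", name="decoder_1"),
])
autoencoder_1.compile(optimizer="adam", loss="mse")
autoencoder_1.fit(X, X, epochs=20, batch_size=32, verbose=0)

# Fix the learned encoding layer and throw the decoding layer away.
encoder_1.trainable = False

# Pass the data through the first encoding layer to obtain its length-3 representation.
H1 = keras.Sequential([keras.Input(shape=(4,)), encoder_1]).predict(X, verbose=0)

# --- Second autoencoder: stacked on the first encoder, it reconstructs H1 via 2 hidden units. ---
encoder_2 = keras.layers.Dense(2, activation="sigmoid", name="encoder_2")
autoencoder_2 = keras.Sequential([
    keras.Input(shape=(3,)),
    encoder_2,
    keras.layers.Dense(3, activation="sigmoid", name="decoder_2"),
])
autoencoder_2.compile(optimizer="adam", loss="mse")
autoencoder_2.fit(H1, H1, epochs=20, batch_size=32, verbose=0)
```

Each autoencoder sees only its own reconstruction objective; nothing in either training run refers to the network's eventual prediction task, which is what makes the procedure greedy.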
Once a sufficient number of encoding layers have been trained, a tuning phase can be applied. In the tuning phase, a final network layer is trained to predict the target output for the network. Unlike the pretraining of the earlier layers of the network, the target output for the final layer is different from the input vector and is specified in the training dataset. The simplest tuning is where the pretrained layers are kept frozen (i.e., the weights in the pretrained layers don’t change during the tuning); however, it is also feasible to train the entire network during the tuning phase. If the entire network is trained during tuning, then the layer-wise pretraining is best understood as finding useful initial weights for the earlier layers in the network. Also, it is not necessary that the final prediction model that is trained during tuning be a neural network. It is quite possible to take the representations of the data generated by the layer-wise pretraining and use them as the input representation for a completely different type of machine learning algorithm, for example, a support vector machine or a nearest neighbor algorithm. This scenario is a very transparent example of how neural networks learn useful representations of data prior to the final prediction task being learned. Strictly speaking, the term pretraining describes only the layer-wise training of the autoencoders; however, the term is often used to refer to both the layer-wise training stage and the tuning stage of the model.
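Continuing the sketch above (and reusing X, encoder_1, and encoder_2 from it), the tuning phase might look like the following. The binary classification target, the loss, and the training schedule are again illustrative assumptions; the commented-out lines at the end indicate how the learned codes could instead feed a non-neural model such as a scikit-learn SVM.

```python
# Toy labels for the supervised tuning phase (a binary target, purely for illustration;
# the book does not tie the example to a particular prediction task).
y = np.random.randint(0, 2, size=(1000, 1))

# Fix the second encoding layer as well, then stack both pretrained encoders
# under a new output layer that is trained to predict the target.
encoder_2.trainable = False
tuned_model = keras.Sequential([
    keras.Input(shape=(4,)),
    encoder_1,   # pretrained, frozen
    encoder_2,   # pretrained, frozen
    keras.layers.Dense(1, activation="sigmoid", name="output"),
])
tuned_model.compile(optimizer="adam", loss="binary_crossentropy")
tuned_model.fit(X, y, epochs=20, batch_size=32, verbose=0)

# Alternative: unfreeze everything and train the whole network, so the layer-wise
# pretraining simply supplies initial weights for the earlier layers.
encoder_1.trainable = True
encoder_2.trainable = True
tuned_model.compile(optimizer="adam", loss="binary_crossentropy")  # recompile after unfreezing
tuned_model.fit(X, y, epochs=5, batch_size=32, verbose=0)

# The pretrained representation can also feed a different learning algorithm, e.g.:
#   from sklearn.svm import SVC
#   codes = keras.Sequential([keras.Input(shape=(4,)), encoder_1, encoder_2]).predict(X, verbose=0)
#   SVC().fit(codes, y.ravel())
```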
Figure 4.5 shows the stages in layer-wise pretraining. The left panel illustrates the training of the initial autoencoder, where an encoding layer (the black circles) of three units is attempting to learn a useful representation for the task of reconstructing an input vector of length 4. The middle panel of figure 4.5 shows the training of a second autoencoder stacked on top of the encoding layer of the first autoencoder. In this autoencoder, a hidden layer of two units is attempting to learn an encoding for an input vector of length 3 (which in turn is an encoding of a vector of length 4).
